Optimal Shortcuts to Adiabatic Control by Lagrange Mechanics

We combined an inverse engineering technique based on Lagrange mechanics and optimal control theory to design an optimal trajectory that can transport a cartpole in a fast and stable way. For classical control, we used the relative displacement between the ball and the trolley as the controller to study the anharmonic effect of the cartpole. Under this constraint, we used the time minimization principle in optimal control theory to find the optimal trajectory, and the solution of time minimization is the bang-bang form, which ensures that the pendulum is in a vertical upward position at the initial and the final moments and oscillates in a small angle range.


Introduction
In the past few decades, with the development of experimental technology, the unstable control of nonlinear unstable systems [1][2][3][4][5] by a moving controller has attracted more and more attention. As far as we know, a sufficiently slow adiabatic process is a very simple way to achieve the ideal transfer, i.e., without excitations or losses [6][7][8][9]. However, a sufficiently slow process is problematic, firstly a sufficiently slow process, is impractically long and, secondly, over a long period of time, the ideal dynamics may be subject to the accumulation of random and/or uncontrollable perturbations that can undermine the expected results. In some experiments and theories, it is necessary to speed up this transport process but also to ensure that there is no excitation or loss at the initial and final moments [10][11][12][13][14][15][16][17]. To solve these problems, "Shortcuts To Adiabaticity" (STA) technology was proposed decades ago [18]. It achieves the same results as the adiabatic process in a short time [19]. At present, STA technology has been successfully applied to atoms, molecules, optical physics, solid state physics, chemistry, engineering, quantum systems, and classical systems [12,[20][21][22][23][24][25]. With the development of STA technology, the STA technique includes many different schemes, such as the quantum invariant-based inverse control method [26][27][28][29], the reverse thermal transfer compensation method [30][31][32], the no-jump quantum drive method [33][34][35], etc. These methods have been validated in a number of experiments.
Cartpole is a typical multivariate, higher-order, nonlinear, strongly coupled, naturally unstable system. The stable control of the cartpole system is a typical problem in control theory. Here, we used the inverse engineering method, based on Lagrange mechanics, for a fast but stable control of the cartpole. By fixing the boundary conditions at the initial and final times, we aimed to find the trajectories that meet the boundary conditions [19,36] but that also achieves transition-free evolution ideally in a shorter time. Later, this approach can be complemented by optimal control theory [37,38], where we used the time minimization criterion in the Pontryagin maximization principle. Pontryagin's maximum (or minimum) principle [39] is used in optimal control theory to find the best possible control for taking a dynamical system from one state to another, especially in the presence of constraints for the state or input controls. It was formulated in 1956 by the Russian mathematician Lev Semenovich Pontryagin and his students. The main thing is to find a controller so that the control Hamiltonian of the system reaches the maximum (or minimum) and then determines the optimal trajectory of the system. In this article, we used the relative displacement of the ball and the trolley horizontally as a controller, and then used the principle to find the optimal trajectory.

Physical Model
Here, we studied an ideal straight type cartpole system model; its schematic diagram and related parameters are shown in Figure 1. The trolley moves from x 0 = 0 and x 0 = d, swinging the pole as far as possible in a vertically upward position at the initial and final moments. Never a perfect harmonic in actual transmission, similar to the transport potential of approximate optical tweezers, the quartic anharmonic plays a very important role, as shown in Figure 1b. We aimed to combine the inverse engineering method [26][27][28][29] with the optimal control theory in the Pontryagin maximization principle [37,38]. By setting boundary conditions to suppress the final excitation, we used the relative distance between the ball and the trolley as a controller throughout the process, to achieve the time minimization. The physical model and related parameters are shown in Figure 1, where M indicates the mass of the trolley, l is the length of the non-elongated rod, m is the mass of the ball, θ is the angle between the pole and the vertical direction, x 0 represents the position of the trolley over time, and x represents the position of the ball. We assumed that the model is an ideal model by setting the following conditions: (i) the quality and friction of the rod can be ignored; (ii) point masses; (iii) constant rod length l; (iv) the trolley position is treated as a control parameter rather than a dynamical variable.

Hamiltonian of the System and Inverse Engineering
The solution of the cartpole system can be solved by Lagrange mechanics. According to the parameters provided in Figure 1, the Cartesian coordinate of mass in a rest frame are given by: where x 0 represents the position of the trolley. The kinetic energy and potential energy of the system can be expressed as follows: where the dots represent time derivatives. It can be obtained by substituting Equation (1) into Equation (2), Thus, the Lagrangian L = T − V of the system can be written as: In order to reverse engineer the trajectory of the trolley, we need to calculate the Hamiltonian of the system, according to p θ = ∂L ∂θ , where p θ is the momentum with θ as the generalized coordinate. So, we need the conjugate momentum of θ, Thus, the Hamiltonian H θ =θ p θ − L can be obtained from the Lagrangian : Using the canonical transformation approach, change the system from θ to q as the coordinate, and the Hamiltonian is rewritten as (please refer to Appendix A for details): where ω = g l is the natural frequency of the pendulum, p represents the momentum with q as the generalized coordinate, q = x 0 − x. According to the canonical transformation [40], we can obtain :q = p m , p = mω 2 q + mω 2 2l 2 q 3 + mẍ 0 , from which the dynamic equation of the system becomes This can be written as:ẍ To fix the distance, we assumed that the trolley is transported from We can set suitable boundary conditions for x(t) according to our expectations, and based on these boundary conditions the trolley trajectory x 0 (t) can be inverse engineered [26][27][28][29] to see how the position of the mass changes over time. We imposed the conditions as follows: In general, there are different options from which to choose the trajectories fulfilling the boundary conditions. For instance, one can choose the polynomial ansatz to interpolate the function. In the following section, we would like to complement it with time-optimal control for this problem.

Optimal Control for Time Minimization
In this section, we try to find the time minimization by using the Pontryagin maximum principle [37,38]. To be consistent, we set new notations for the Equation (11) of motion of the system: such that (where u(t) is the controller, representing the relative displacement of the ball to the trolley in the horizontal direction; x is the dynamic variable, representing the position of the ball) For this system, the boundary conditions for y 1 , y 2 and u(t) can be given by Equations (11) and (12), and u(t) is required to satisfy u(t) = 0 at t ≤ 0 and t ≥ t f . From the optimal control theory, one can find |u(t)| ≤ δ for a fixed bound δ and minimize the cost function J. The Pontryagin maximum principle finally provides the necessary conditions for optimal control [37,38].
For the relative displacement between the ball and the trolley, we utilized optimal control theory to find the time-optimal trajectory [38], so we defined the cost function as: In this manner, we defined the control Hamiltonian as: For almost all 0 ≤ t ≤ t f , the function H c (p(t), y(t), u) attains its maximum at u = u * and H c (p(t), y(t), u * ) = c, where c is constant, andẏ = f (y(t), u). According to the cost function, the control Hamiltonian can be written as: from which the canonical equations areṗ Accordingly, we can obtain the following costate equations by way of Equation (18): We can easily get p 1 = c 1 and p 2 = −c 1 t + c 2 , where c 1 , c 2 are constant. In order to maximize the control of the Hamiltonian, we can obtain the optimal u(t) as a form of bang-bang control, as shown in Figure 2, with only one intermediate switching time at t 1 . Since the controller u(t) is in the form of a switch, the trajectory of the trolley x 0 (t) changes abruptly at t = 0 + . By substituting u(t) into Equation (11) and combining the boundary conditions in Equation (12), we eventually obtain the trajectory of x(t), According to u(t) = x 0 − x, the trajectory of the trolley x 0 (t) can be represented as: Solving the system of Equations (14) and (15), we find the switching time t 1 and final time t f as follows: Again, the trajectory of the trolley and the trajectory of the swing ball are shown in Figure 3. Through the above control, the ball can be oscillated in a small angle range.
After finding the optimal trajectory, we brought it into the exact Equation (5) and approximate dynamic equations. Clearly, short times imply larger transient angles, as shown in Figure 4. In addition, for the long time, we can see that there is basically no big difference in the change of angle over time, but the difference is noticeable only for the smaller time, t f = 3 s. The main reason is that if we ignore the higher-order terms, the approximation deviates from the reality. At the same time, because the δ is the relative displacement between the ball and the trolley, the maximum value of which is the length of the pendulum, when the δ has a fixed value, the t f will have a lower bound. In other words, when δ reaches a certain value, the transmission time of the system is relatively small, the speed is relatively large, and the system may break down so that the control is not very good. According to the analysis of physical meaning, when δ = l, in this case, the pendulum is in a "lying flat" state, that is, in horizontal direction, the system basically breaks down completely. In short, we can determine whether the system breaks down based on the change in relative displacement with δ.

Conclusions
In conclusion, we applied reverse engineering based on Lagrange mechanics and time minimization in the Pontriagin maximization principle for the fast but stable control of nonlinear unstable systems, e.g., the cartpole. In such control, we took an anharmonic effect and analyzed the stability of control. By fixing the relative displacement between the position of the ball and the trolley as constraints, we applied optimal control theory to design a time-optimal trajectory, as compared to the case of harmonic approximation. In the future, we will work out the classical control of the inverted double pendulum, and compare it with the control of the cartpole. Of course, the machine learning, fed by the results of STA, can be an interesting area for exploration as well.  where ω = g l is the natural frequency of the pendulum. Usually, the term p 2 q 2 2ml 2 can be ignored for second-order terms, but not for quartic terms. Here, we can not perform another canonical transformation to cancel this term, so we have to compromise by ignoring it in the next calculation, keeping other feasible high-order terms, and calculate the the controller.
By numerical simulations of dynamics without any approximation, even though the p 2 q 2 2ml 2 term is neglected, the controller still has a satisfying performance. Somehow, we may conclude that, the effect of such term is not that significant.